Using Similarity Scoring to Improve the Bilingual Dictionary for Sub-sentential Alignment

نویسندگان

  • Katharina Probst
  • Ralf D. Brown
چکیده

We describe an approach to improve the bilingual cooccurrence dictionary that is used for word alignment, and evaluate the improved dictionary using a version of the Competitive Linking algorithm. We demonstrate a problem faced by the Competitive Linking algorithm and present an approach to ameliorate it. In particular, we rebuild the bilingual dictionary by clustering similar words in a language and assigning them a higher cooccurrence score with a given word in the other language than each single word would have otherwise. Experimental results show a significant improvement in precision and recall for word alignment when the improved dicitonary is used.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Aligning linguistically motivated phrases

In this paper, we describe the architecture of a sub-sentential alignment system that links linguistically motivated phrases in parallel texts. We conceive our sub-sentential aligner as a cascade model consisting of two phases. In the first phase, anchor chunks are linked on the basis of lexical correspondences and syntactic similarity. In the second phase, we will focus on the more complex tra...

متن کامل

Using Similarity Scoring To Improve the Bilingual Dictionary for Word Alignment

We describe an approach to improve the bilingual cooccurrence dictionary that is used for word alignment, and evaluate the improved dictionary using a version of the Competitive Linking algorithm. We demonstrate a problem faced by the Competitive Linking algorithm and present an approach to ameliorate it. In particular, we rebuild the bilingual dictionary by clustering similar words in a langua...

متن کامل

Using Punctuations and Lengths for Bilingual Sub-sentential Alignment

We present a new approach to aligning bilingual English and Chinese text at sub-sentential level by interleaving alphabetic texts and punctuations matches. With sub-sentential alignment, we expect to improve the effectiveness of alignment at word, chunk and phrase levels and provide finer grained and more reusable translation memory.

متن کامل

Bilingual phrase-to-phrase alignment for arbitrarily-small datasets

This paper presents a novel system for sub-sentential alignment of bilingual sentence pairs, however few, using readily-available machine-readable bilingual dictionaries. Performance is evaluated against an existing gold-standard parallel corpus where word alignments are annotated, showing results that are a considerable improvement on a comparable system and on GIZA++ performance for the same ...

متن کامل

Interleaving Text and Punctuations for Bilingual Sub-sentential Alignment

We present a new approach to aligning bilingual English and Chinese text at sub-sentential level by interleaving alphabetic texts and punctuations matches. With sub-sentential alignment, we expect to improve the effectiveness of alignment at word, chunk and phrase levels and provide finer grained and more reusable translation memory.

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2002